I. INTRODUCTION

Several years ago, Chris Jarzynski and one of us (DM)
introduced a solvable model of a thermodynamic ratchet
that leveraged information to convert thermal energy to
work [1, 2]. Our hope was to give a new level of understanding
of the Second Law of Thermodynamics and one
of its longest-lived counterexamples, Maxwell’s Demon.
As it reads in “bits” from an input string Y, a detailed-balance
stochastic multistate controller raises or lowers a
mass against gravity, writing “exhaust” bits to an output
string Y'.

A complete understanding of the ratchet’s thermodynamics
requires exactly accounting for all of the information
embedded in the input and output strings and how that
information is changed by the ratchet. To simplify, we
assumed the input bits came from a biased coin and so the
input information could be measured using the single-bit
Shannon entropy $H[Y_0]$. The information in the output
string was much more challenging to quantify, since
correlations are necessarily introduced by the action of the
memoryful ratchet. Unfortunately, due to mathematical
complications arising from this, we could only estimate
the single-bit entropy $H[Y'_0]$ of the output, which, it
must be said, is only an upper bound on the actual
information per output bit. Nonetheless, the estimate of the
change $\Delta H = H[Y'_0] - H[Y_0]$ from input to output was
good enough to show that the ratchet was quite functional,
operating as an “engine” in some regimes and an
“eraser” in others.

Following in this spirit, the three of us here recently
introduced a similar memoryful ratchet for which all of the
informational correlations in the output bit string can be
calculated exactly and in closed form [?]. As a result,
we could then show that the change
$\Delta h_\mu = h_\mu[Y'] - h_\mu[Y]$ in the Shannon entropy
rate $h_\mu[X] = \lim_{\ell \to \infty} H[X_0 X_1 \ldots X_{\ell-1}]/\ell$
allowed one to identify all of the ratchet’s thermodynamic
functionality. We emphasized, in particular, that using the
single-bit Shannon entropy change $\Delta H$ would miss much
of that functionality, as $H[Y'_0] \geq h_\mu[Y']$. And, as such,
we generalized Refs. [1, 2]’s single-bit $\Delta H$ “Second Law”
to use the Shannon entropy rate. The underlying methods
leveraged a new way to account for the information storage
and transformation induced by memoryful channels [?]. A
similarly complete analytical treatment of a companion Demon,
Szilard’s Engine, was recently given by two of us (AB
and JPG) [?].
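
The gap between the single-symbol entropy and the entropy rate can be illustrated numerically. The following is a minimal sketch, not taken from any of the cited papers: it uses a two-state Markov chain as a stand-in for a correlated bit string, with hypothetical flip probabilities `a` (0 to 1) and `b` (1 to 0), and all function names are our own illustrative choices. It compares $H[X_0]$, the block estimate $H[X_0 \ldots X_{\ell-1}]/\ell$, and the entropy rate $h_\mu$.

```python
from itertools import product
from math import log2


def h2(b):
    """Binary entropy (in bits) of a coin with bias b."""
    return 0.0 if b in (0.0, 1.0) else -b * log2(b) - (1 - b) * log2(1 - b)


def stationary(a, b):
    """Stationary distribution (pi0, pi1) of the two-state chain."""
    return (b / (a + b), a / (a + b))


def single_symbol_entropy(a, b):
    """H[X_0]: entropy of one symbol drawn from the stationary distribution."""
    return h2(stationary(a, b)[1])


def entropy_rate(a, b):
    """h_mu for a Markov chain: stationary average of per-state uncertainty."""
    pi0, pi1 = stationary(a, b)
    return pi0 * h2(a) + pi1 * h2(b)


def block_entropy(a, b, ell):
    """H[X_0 ... X_{ell-1}] by brute-force enumeration of all 2**ell words."""
    T = ((1 - a, a), (b, 1 - b))  # row-stochastic transition matrix
    pi = stationary(a, b)
    total = 0.0
    for word in product((0, 1), repeat=ell):
        p = pi[word[0]]
        for s, t in zip(word, word[1:]):
            p *= T[s][t]
        if p > 0:
            total -= p * log2(p)
    return total


if __name__ == "__main__":
    a, b = 0.1, 0.1  # "sticky" chain: strongly correlated symbols
    print("H[X_0]      =", single_symbol_entropy(a, b))
    print("H[X_0:8]/8  =", block_entropy(a, b, 8) / 8)
    print("h_mu        =", entropy_rate(a, b))
```

For a correlated chain the three numbers are ordered from largest to smallest, and the block estimate approaches $h_\mu$ as the block length grows. For an uncorrelated (coin-flip) chain, such as `a = 0.3, b = 0.7`, the single-symbol entropy and the entropy rate coincide.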


II. SPECIAL CASE OF THE MEMORYLESS 
TRANSDUCER 

A recent arXiv post [?] complained that our work [?] is
misleading in certain aspects. It also claims priority over
our entropy-rate Second Law [?, Eq. (4)], stating that
Eq. (24) of Ref. [?] is the same. This is mathematically
incorrect. Moreover, our Ref. [?] is very clear about its
contributions. In short, our treatment is more general,
since it considers the much broader class of Demons with
arbitrary memory. Such Demons, as we describe in our
manuscript, can be represented as memoryful channels,
otherwise known as transducers [?]. In stark contrast,
Ref. [?]’s treatment is sufficient only for describing
memoryless channels, a highly restricted, markedly simpler
case. More to the point, its methods are inapplicable
to our memoryful channel setup. The error occurs in
the proof of Ref. [?]’s Eq. (24), which contains a statement
that can be violated by memoryful channels. We
provide two counterexamples to this erroneous statement
in our response below. Finally, the case of memoryless
Demons violates the spirit of Refs. [1, 2]’s original work.
The mathematical errors and misinterpretation of physical
relevance subvert the arXiv post’s claims. We now
turn to respond to its three specific comments in greater
detail.
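
To make concrete what it means for a memoryful channel to violate a statement that holds for memoryless ones, here is a minimal sketch. It is an illustration written for this passage, not code from the papers under discussion, and the helper names `period2_joint`, `conditional_entropy` and `next_output_uncertainties` are hypothetical. The toy channel ignores its fair-coin inputs and emits one of the two period-2 output words with equal probability. The code compares the uncertainty in the next output bit given the input history alone with the uncertainty given both the input and output histories. For a memoryless channel the two would coincide.

```python
from collections import defaultdict
from itertools import product
from math import log2


def entropy(dist):
    """Shannon entropy (bits) of a pmf given as {outcome: probability}."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)


def conditional_entropy(joint, target, given):
    """H[target | given] = H[given, target] - H[given].

    `joint` maps outcome tuples to probabilities; `target` and `given`
    are lists of positions within each outcome tuple.
    """
    both, cond = defaultdict(float), defaultdict(float)
    for outcome, p in joint.items():
        key_given = tuple(outcome[k] for k in given)
        key_target = tuple(outcome[k] for k in target)
        both[key_given + key_target] += p
        cond[key_given] += p
    return entropy(both) - entropy(cond)


def period2_joint(i):
    """Joint pmf over (Y_1..Y_{i-1}, Y'_1..Y'_i) for the toy channel.

    Assumptions of this sketch: inputs are fair coin flips, and the
    output is 0101... or 1010... with probability 1/2 each,
    independent of the inputs.
    """
    n_in = i - 1
    joint = {}
    for inputs in product((0, 1), repeat=n_in):
        for phase in (0, 1):
            outputs = tuple((phase + k) % 2 for k in range(i))
            joint[inputs + outputs] = 0.5 ** n_in * 0.5
    return joint


def next_output_uncertainties(i):
    """Return (H[Y'_i | input past], H[Y'_i | input past, output past])."""
    joint = period2_joint(i)
    n_in = i - 1
    input_past = list(range(n_in))
    output_past = list(range(n_in, n_in + i - 1))
    target = [n_in + i - 1]
    return (
        conditional_entropy(joint, target, input_past),
        conditional_entropy(joint, target, input_past + output_past),
    )


if __name__ == "__main__":
    print(next_output_uncertainties(4))
```

Given only the inputs, the next output bit carries one full bit of uncertainty. Once the output history is also known, the phase is fixed and the uncertainty vanishes, so the two conditional entropies differ.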


Comment 1 

In his first comment, the arXiv post’s author states
that the following sentences in our paper give a “very
strong misleading impression that the paper above is the
first to incorporate correlations successfully in general”
(using his own words). This is a simple misreading, as our
text makes clear:

We introduce a family of Maxwellian Demons
for which correlations among information
bearing degrees of freedom can be calculated
exactly and in compact analytical form.
This allows one to precisely determine Demon
functional thermodynamic operating regimes,
when previous methods either misclassify or
simply fail due to the approximations they
invoke.

Note that this explicitly mentions the solvable aspect of
our model: that correlations can be calculated exactly in
a compact, analytical form. We stand by the claim that
ours is the first such solvable model. We did not claim to
be the first to consider correlations. More pointedly, the
author’s actual article [?] does not have any model with
calculable correlations. We justify the second sentence
quoted above, on identifying functional thermodynamics,
through explicit calculations and diagrams in Sec. V of
our paper. The arXiv post ignores these.

The author claims that the “main result in Section 4
of [1] was exactly the same as the above mentioned upper
bound on the extracted work in terms of the change in
the joint entropy ... .” In this, he refers to Eq. (4)
of our paper and claims that he had derived it before
as Eq. (24) of his paper [?]. While we agree that our
equation superficially looks like the infinite-time limit of
the author’s equation, their relationship is different than
a glance suggests:

• The author’s proof of Eq. (24) [?] does not apply
to our setup. This is because the author considered
the much simpler case of memoryless channels,
whereas we considered the much more mathematically
challenging case of memoryful channels. (We
return to this point again in the context of the third
comment.)

• Appendix A in our paper clearly shows that Eq. (4)
there is valid only in the asymptotic limit of stationary
input bits for a finite-state Demon. (These
are standard assumptions in the field.) In the absence
of these assumptions, we have a more general form
of the Second Law, discussed in detail in Appendix
A [?]. The arXiv post neglects these discussions.

Comment 2 

The arXiv post quotes the following from our paper: 

In effect, they account for Demon 
information-processing by replacing the 
Shannon information of the components 
as a whole by the sum of the components’ 
individual Shannon informations. Since the 
latter is larger than the former [19], these 
analyses lead to weak bounds on the Demon 
performance. 

The post then goes on to claim that the second assertion may
not be true if the incoming bits $\{Y_i\}$ are correlated. This
is the case in the author’s Ref. [?], where the sum of
individual entropy differences is actually stronger, under the
additional assumption that the Demon is memoryless.
We agree. But, as the author himself points out, our
claim is true if the incoming bits are uncorrelated. We
explicitly state that we are considering this case, where
the input is uncorrelated, in the paragraph following Eq.
(4). And this happens to be the case for all the exactly
solvable models of Maxwell’s Demon developed so far
(referred to by “they” in the above quote). (We reiterate,
the author has not given any exactly solvable model of
Maxwell’s Demon with calculable correlations in [?].)

The author did not sufficiently consider the remainder
of our development before expressing his criticism in
public. After Eq. (4), we explicitly mention the sufficient
condition of uncorrelated incoming bits for Eq. (4) to be
stronger than Eq. (2).
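
The sufficient condition can be spelled out in a few lines. The following is a sketch in the notation of the Introduction, under two assumptions: the input bits are uncorrelated (independent and identically distributed) and the output process is stationary.

```latex
% Assumption 1: uncorrelated (IID) input, so the block entropy is additive.
h_\mu[Y] = \lim_{\ell \to \infty} \frac{H[Y_0 Y_1 \ldots Y_{\ell-1}]}{\ell}
         = \lim_{\ell \to \infty} \frac{\ell \, H[Y_0]}{\ell}
         = H[Y_0]

% Assumption 2: stationary output; subadditivity of the joint entropy.
H[Y'_0 Y'_1 \ldots Y'_{\ell-1}] \leq \sum_{k=0}^{\ell-1} H[Y'_k] = \ell \, H[Y'_0]
\quad \Longrightarrow \quad
h_\mu[Y'] \leq H[Y'_0]

% Combining the two:
\Delta h_\mu = h_\mu[Y'] - h_\mu[Y] \leq H[Y'_0] - H[Y_0] = \Delta H
```

So if, as the surrounding discussion suggests, Eq. (2) bounds the work by a multiple of $\Delta H$ and Eq. (4) bounds it by the same multiple of $\Delta h_\mu$, then for uncorrelated input the entropy-rate bound is never the weaker of the two. With correlated input the first line fails and the ordering is no longer guaranteed.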

According to the author, “the point in second law and
its extensions ... should be to provide, first and foremost,
an extended version of the second law in a faithful
manner, namely, to show the increase of the real entropy
of the entire system, including that of the information
reservoir. In the correlated case, the latter is given by
the change in the joint entropy of the symbols, regardless
of whether or not this is smaller or larger than the sum
of individual entropy differences.” We disagree. This is
nothing more than an attempt to rewrite the history of
physics.

The primary emphasis of the Second Law from its very
inception has been on the strongest possible bounds.
When Sadi Carnot formulated the Second Law, it was
all about the maximum efficiency of heat engines, that is,
the maximum possible work that can be extracted [?]. Entropy
was a derived concept, entering through the works of
Clausius and Thomson [?].

The author mentions that “bounds are useful when they
are easier to calculate than the real quantity of interest,
which is not quite the case in this context. Quite the
contrary, joint entropies (especially of long blocks) are
much harder to calculate.” He fails to notice that we
attained precisely this “hard” task by calculating exactly
the entropy rate
$h_\mu[Y'] = \lim_{\ell \to \infty} H[Y'_0 Y'_1 \ldots Y'_{\ell-1}]/\ell$.
(We might, at this point, recommend the review of correlations
and information in random-variable blocks presented by Ref.
[?].) And the entropy rate is smaller than the individual
entropy difference, which in our case has observable
consequences, as discussed in detail in Sec. V of our
paper. Even the later part of his comment, “work itself
... depends only on the input and output marginals,” is
not true in a generic memoryful situation. We have
explicit examples (unpublished) where the extracted work
also depends on correlations.


Comment 3 

Here, the author claims that the “bound in [57] ... is
exactly the same as in eq. (4) of 1507.01537v2, except
that in [57], no limit on is taken over the normalized
entropies (but this is because even stationarity is not
assumed there, so the limit might not exist). Moreover,
while it is true that in the model of [57] the channel was
memoryless, the derivation itself of this very same bound
(in Section 4 of [57]) was not sensitive to the channel
memorylessness assumption.” (Citation [57] corresponds
to Ref. [?] here.) We agree that Eq. (4) in our paper
appeared in a somewhat different form than in his paper, as
Eq. (24). This is moot, however. His derivation does not
apply to our case nor to the original solvable Maxwell’s
demon [1]. In his justification, the author says that the
“crucial step in [57] ... was the equality

$$ H(Y'_i \mid Y_1, \ldots, Y_{i-1}, Y'_1, \ldots, Y'_{i-1}) = H(Y'_i \mid Y_1, \ldots, Y_{i-1}) \tag{1} $$

which is the case when

$$ Y'_i \to (Y_1, \ldots, Y_{i-1}) \to (Y'_1, \ldots, Y'_{i-1}) \tag{2} $$

forms a Markov chain, and this happens not only for a
memoryless channel, but for any causal channel without
feedback, namely,

$$ P(Y'_1, \ldots, Y'_n \mid Y_1, \ldots, Y_n) = \prod_{i=1}^{n} P(Y'_i \mid Y_1, \ldots, Y_i) . \tag{3} $$

In physical terms, this actually means full generality.”

This analysis is incorrect. Equation (3) above is not
sufficiently general, since it does not consider the case in
which the Demon is a memoryful channel. When the Demon
has memory, its internal state can depend on both
the input past $Y_1, \ldots, Y_{i-1}$ and output past
$Y'_1, \ldots, Y'_{i-1}$.
The Demon’s internal states store information about the
past of Y or Y' and can communicate it to the outgoing
bits of Y'. The author’s assertion of “full generality” is
false. In fact, the memoryless assumption is violated for
the original solvable model of Maxwell’s Demon [1] in
which the Demon has three internal states. For a memoryful
Demon Eq. (2) above is not a Markov chain. See
Ref. [?]’s discussion of memoryful transduction.

To see how Eq. (1) can be violated, consider the case
of a memoryful Demon that simply ignores the input bits
and outputs a period-2 process. This means that there
are two possible output words:

$$ \Pr(Y'_1 Y'_2 \ldots = 010101\ldots) = \Pr(Y'_1 Y'_2 \ldots = 101010\ldots) = 1/2 . $$

In this case the uncertainty of the ith output given the
history of inputs is $H[Y'_i \mid Y_{1:i}] = 1$, since we are
completely uncertain as to whether the ith bit is a zero
or one. Note that we used the notational shorthand $Y_{1:i}$
to represent the random variables $Y_1, Y_2, \ldots, Y_{i-1}$. When
we also condition on the history of output bits, we find
that we are completely certain of the next bit, since we
know the output’s phase, and $H[Y'_i \mid Y_{1:i}, Y'_{1:i}] = 0$. The
most general relation for the uncertainty of the output
is:

$$ H[Y'_i \mid Y_{1:i}, Y'_{1:i}] \leq H[Y'_i \mid Y_{1:i}] . $$

This inequality, replacing Eq. (1) above, renders the
proof in the author’s paper inapplicable to our situation.

This reflects the fact that a memoryful ratchet can and
typically does create correlations among the outgoing bits
even though the incoming bits may not be correlated. In
fact, we can exactly calculate the uncertainty in the next
output bit conditioned on the infinite length input and
output histories of the memoryful ratchet we describe in
our Ref. [?]. When the ratchet is driven by a fair coin
input process, the two quantities of interest are:

$$ \lim_{i \to \infty} H[Y'_i \mid Y_{1:i}] = \frac{1}{2} \left( H\!\left(\frac{p}{2}\right) + H\!\left(\frac{q}{2}\right) \right) , $$

where $H(b)$ is the binary entropy function for a coin of
bias b [13], and:

$$ \lim_{i \to \infty} H[Y'_i \mid Y_{1:i}, Y'_{1:i}] = \frac{1}{4} \left( H(p) + H(q) \right) , $$

and their difference:

$$ \lim_{i \to \infty} \left( H[Y'_i \mid Y_{1:i}] - H[Y'_i \mid Y_{1:i}, Y'_{1:i}] \right) = \frac{1}{2} \left( H\!\left(\frac{p}{2}\right) + H\!\left(\frac{q}{2}\right) \right) - \frac{1}{4} \left( H(p) + H(q) \right) \geq 0 , $$

by the concavity of $H(\cdot)$. This is only zero when $p = q = 0$.
Thus, the assumption made in Eq. (1) is not just insufficiently
general, but it is explicitly violated in the physical
memoryful ratchet considered in our work.

On a more conceptual level, Eq. (4) in our paper is
valid only in the asymptotic limit of a stationary input
with a finite-state Demon. Otherwise, there would
be natural generalizations incorporating the Demon’s
entropy and its correlations with the bits, as is amply
discussed in Appendix A of our paper.


III. SUMMING UP


As our response to the arXiv post’s Comment 3 just
made plain, the essential issue reduces to the post’s author
misapplying results for memoryless channels. Most
directly, the post’s claim to priority for our entropy-rate
Second Law is invalid. Perhaps the simple memoryless
channel case, one very broadly adopted in elementary
information theory [13], prevented the post’s author from
appreciating this and related technical points. Whatever
the motivation, it led to the post’s public airing of a series
of grievances, grievances that derive not from misleading
text, but from the author’s misinterpretation. That
said, we do appreciate the opportunity to emphasize the
central role of memory and structure in thermodynamics.